Optimize Gelu with MKL Erf function #15770
test=develop
… mkl kernel. test=develop
| SET(MKLML_SHARED_IOMP_LIB ${MKLML_LIB_DIR}/libiomp5md.dll) | ||
| ELSE() | ||
| SET(MKLML_VER "mklml_lnx_${TIME_VERSION}" CACHE STRING "" FORCE) | ||
| SET(MKLML_VER "VsErf_mklml_lnx_${TIME_VERSION}" CACHE STRING "" FORCE) |
Please add a comment noting that this is a temporary mklml lib that includes erf, e.g.
TODO(intel-huying)?
paddle/fluid/operators/math/blas.h
| template <typename T> | ||
| void VINV(int n, const T* a, T* y) const; | ||
| #ifdef PADDLE_WITH_MKLML |
Do not add #ifdef here.
You can make this function general.
You can refer to VMUL for an example.
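One way to read the suggestion above is sketched here: the function is declared unconditionally with a portable fallback, and the MKL path is confined to specializations. This is only an illustration under assumptions. `VectorErf` and the `SKETCH_WITH_MKL` macro are hypothetical names, not Paddle's actual `Blas` interface, and the MKL branch assumes the VML functions `vmsErf`/`vmdErf` are available through `<mkl.h>`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical build switch for this sketch; the PR itself uses PADDLE_WITH_MKLML.
#ifdef SKETCH_WITH_MKL
#include <mkl.h>
#endif

// General version: always declared, so callers need no #ifdef.
// Falls back to the standard library's scalar erf.
template <typename T>
void VectorErf(int n, const T* a, T* y) {
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    y[i] = std::erf(a[i]);
  }
}

#ifdef SKETCH_WITH_MKL
// MKL-backed specializations, compiled only when the library is present.
template <>
inline void VectorErf<float>(int n, const float* a, float* y) {
  vmsErf(n, a, y, VML_LA);
}

template <>
inline void VectorErf<double>(int n, const double* a, double* y) {
  vmdErf(n, a, y, VML_LA);
}
#endif
```

With this shape, a caller writes `VectorErf(n, in, out)` regardless of whether MKL is linked, which is the property the review comment asks for.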
| std::memset(out_data, 0, n * sizeof(T)); | ||
| math::CBlas<T>::AXPY(n, static_cast<T>(M_SQRT1_2), x_data, 1, out_data, 1); | ||
| math::CBlas<T>::VMERF(n, out_data, out_data, VML_LA); | ||
| for (int i = 0; i < n; i++) out_data[i] += static_cast<T>(1); |
Code style: please use braces for the loop body,
for () {
...
}
Same below.
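For readers following the snippet under review, the steps it performs (scale by 1/sqrt(2), apply erf, add 1) can be sketched with the standard library alone. This is a reference sketch, not the PR's kernel: `GeluReference` is a hypothetical name, `std::erf` stands in for the MKL call, and the final multiplication by `0.5 * x` is not shown in the excerpt above and is assumed from the usual GELU definition, 0.5 * x * (1 + erf(x / sqrt(2))).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
// Standard-library stand-in for the AXPY + VMERF sequence in the diff.
template <typename T>
void GeluReference(const std::vector<T>& x, std::vector<T>* out) {
  assert(out != nullptr);
  out->resize(x.size());
  // 1/sqrt(2), the same constant as M_SQRT1_2 in the diff.
  const T inv_sqrt2 = static_cast<T>(0.70710678118654752440);
  for (std::size_t i = 0; i < x.size(); ++i) {
    // Step 1 (AXPY in the diff): scale the input by 1/sqrt(2).
    T v = x[i] * inv_sqrt2;
    // Step 2 (VMERF in the diff): apply erf element-wise.
    v = std::erf(v);
    // Step 3: add 1, then multiply by 0.5 * x (assumed final step).
    (*out)[i] = static_cast<T>(0.5) * x[i] * (v + static_cast<T>(1));
  }
}
```

A scalar loop like this is useful mainly as a correctness baseline to compare the vectorized kernel against; it also follows the braced loop style requested above.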
@panyx0718 Please help review this PR, since key files are changed:
[10:23:00] + echo 'current pr 15770 got approvals: FALSE'
The new profile #15301 seems to have a dependency problem: http://ci.paddlepaddle.org/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=61983&_focus=9247
This reverts commit 676995c.
Based on the performance profile of the BERT model, this PR optimizes the GELU operator to accelerate data processing.
Platform: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Model Path: third_party/inference_demo/bert_emb128/model
Batch Size: 1
Command: ./paddle/fluid/inference/tests/api/test_analyzer_bert --infer_model=third_party/inference_demo/bert_emb128/model/ --infer_data=third_party/inference_demo/bert_emb128/data.txt --gtest_filter=Analyzer_bert.profile --paddle_num_threads=1 --repeat=1 --batch_size=1 --test_all_data --profile
Data Source: third_party/inference_demo/bert_emb128/data.txt.
The following compares the different scenarios.